feat(streaming): arrangement backfill #10266
Codecov Report
@@ Coverage Diff @@
## main #10266 +/- ##
==========================================
- Coverage 70.34% 70.26% -0.09%
==========================================
Files 1274 1276 +2
Lines 219041 219308 +267
==========================================
+ Hits 154085 154097 +12
- Misses 64956 65211 +255
... and 7 files with indirect coverage changes
83c4e01 to 16ece5b
I have updated the implementation; it should now match the proposal from @hzxa21: #10266 (comment). I will leave comments on the parts that may need further review, and will do a second pass tomorrow as well.
hzxa21 left a comment
Nicely done. Thanks for the PR! Left some comments.
Rest LGTM!
debug_assert!(!state_table.vnode_bitmap().is_empty());
let vnodes = state_table.vnodes().iter_vnodes();
let mut result = Vec::with_capacity(state_table.vnodes().len());
for vnode in vnodes {
Since get_progress_per_vnode is called when we need to init the backfill state, the state table cache is very likely to be empty, so each get_row will be an actual S3 I/O. How about using futures::future::try_join_all to issue the state_table.get_row calls concurrently and reduce overall latency?
I assumed the Rust compiler would be able to perform each get_row concurrently, even within a for loop. I guess I should build a small toy program to test that.
I will follow your suggestion to be safe.
EDIT: Thinking about it again, I guess if we leave it to the Rust runtime, it may not schedule all get_row calls concurrently, e.g. it may stagger them? So try_join_all will enforce it.
Updated in 9cc6ad2
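For readers following this thread: an `.await` inside a `for` loop completes each future before the next one is created, so the reads are issued one at a time. The sketch below illustrates the difference using only the standard library. `FakeGetRow`, `block_on`, `read_sequentially`, `read_concurrently`, and `event_order` are hypothetical names invented for this illustration; `read_concurrently` is a simplified stand-in for what `futures::future::try_join_all` does, not the code from this PR or the real `StateTable` API.

```rust
use std::future::{poll_fn, Future};
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

type Log = Arc<Mutex<Vec<String>>>;
type BoxedRead = Pin<Box<dyn Future<Output = Result<usize, String>>>>;

/// Hypothetical stand-in for `state_table.get_row`: the first poll "issues"
/// the read and returns Pending, the second poll completes it.
struct FakeGetRow {
    vnode: usize,
    issued: bool,
    log: Log,
}

impl Future for FakeGetRow {
    type Output = Result<usize, String>;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if !self.issued {
            self.issued = true;
            self.log.lock().unwrap().push(format!("issue {}", self.vnode));
            cx.waker().wake_by_ref(); // ask to be polled again
            Poll::Pending
        } else {
            self.log.lock().unwrap().push(format!("done {}", self.vnode));
            Poll::Ready(Ok(self.vnode))
        }
    }
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor, sufficient for this demo only.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Awaiting inside a `for` loop: each read finishes before the next is issued.
async fn read_sequentially(vnodes: usize, log: Log) -> Result<Vec<usize>, String> {
    let mut rows = Vec::with_capacity(vnodes);
    for vnode in 0..vnodes {
        rows.push(FakeGetRow { vnode, issued: false, log: log.clone() }.await?);
    }
    Ok(rows)
}

/// Simplified join: every unfinished read is polled on each pass, so all
/// reads are issued before any of them completes. Fails fast on the first Err.
async fn read_concurrently(vnodes: usize, log: Log) -> Result<Vec<usize>, String> {
    let mut reads: Vec<BoxedRead> = (0..vnodes)
        .map(|vnode| Box::pin(FakeGetRow { vnode, issued: false, log: log.clone() }) as BoxedRead)
        .collect();
    let mut rows: Vec<Option<usize>> = vec![None; vnodes];
    poll_fn(move |cx| {
        let mut pending = false;
        for (i, read) in reads.iter_mut().enumerate() {
            if rows[i].is_some() {
                continue; // never poll a completed future again
            }
            match read.as_mut().poll(cx) {
                Poll::Ready(Ok(row)) => rows[i] = Some(row),
                Poll::Ready(Err(e)) => return Poll::Ready(Err(e)),
                Poll::Pending => pending = true,
            }
        }
        if pending {
            return Poll::Pending;
        }
        let all: Vec<usize> = rows.iter().map(|r| r.unwrap()).collect();
        Poll::Ready(Ok(all))
    })
    .await
}

/// Runs three fake reads and returns the order in which events happened.
fn event_order(concurrent: bool) -> Vec<String> {
    let log: Log = Arc::new(Mutex::new(Vec::new()));
    let rows = if concurrent {
        block_on(read_concurrently(3, log.clone()))
    } else {
        block_on(read_sequentially(3, log.clone()))
    };
    assert_eq!(rows, Ok(vec![0, 1, 2]));
    let events = log.lock().unwrap().clone();
    events
}
```

Calling `event_order(false)` records each read finishing before the next is issued, while `event_order(true)` records all three reads issued up front. With real I/O the first pattern costs roughly the sum of the read latencies and the second roughly the slowest single read, which is the motivation for the reviewer's suggestion.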
    .iter()
    .all(|(_, p)| *p == BackfillProgressPerVnode::Completed);
if is_completely_finished {
    assert!(!first_barrier.is_newly_added(self.actor_id));
Isn't this also true when is_completely_finished == false?
If is_completely_finished == false, this assertion can be false as well.
For instance, all progress may be BackfillProgressPerVnode::NotStarted, which could mean the actor comes from a new streaming graph. In that case the assertion would be false.
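The reasoning in this thread can be sketched as a small standalone check. This is an illustration under assumptions, not the executor code: the enum below is a simplified stand-in with only the two variants named in the discussion, vnodes are plain `u16` values, and `validate_first_barrier` is a hypothetical helper that takes the result of `is_newly_added` as a boolean.

```rust
/// Simplified stand-in: only the two variants mentioned in the thread.
#[derive(Debug, Clone, PartialEq)]
enum BackfillProgressPerVnode {
    NotStarted,
    Completed,
}

/// True only when every vnode has finished backfilling.
/// Note: `all` is vacuously true for an empty slice, so callers are expected
/// to pass at least one vnode (the reviewed code asserts a non-empty bitmap).
fn is_completely_finished(progress: &[(u16, BackfillProgressPerVnode)]) -> bool {
    progress
        .iter()
        .all(|(_, p)| *p == BackfillProgressPerVnode::Completed)
}

/// Hypothetical helper: the "not newly added" invariant is checked only when
/// backfill is completely finished. An actor from a new streaming graph has
/// all-NotStarted progress and is legitimately newly added.
fn validate_first_barrier(
    progress: &[(u16, BackfillProgressPerVnode)],
    is_newly_added: bool,
) -> Result<bool, String> {
    let finished = is_completely_finished(progress);
    if finished && is_newly_added {
        return Err("finished backfill state on a newly added actor".to_string());
    }
    Ok(finished)
}
```

A newly added actor with all vnodes at `NotStarted` passes the check and reports unfinished backfill, which is why the assertion cannot be hoisted out of the `is_completely_finished` branch.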
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
RFC: https://www.notion.so/risingwave-labs/Arrangement-Backfill-31d4d0044726451c9cc77dc84a0f8bb2
We implement the arrangement backfill executor in this PR.
Integration tests are difficult to write for it, since CreateMviewProgress needs to be mocked as well, which does not seem trivial to me.
Since this is not used yet, I think it is fine to just review the corresponding code changes before merging. I will defer frontend changes to a later PR to keep this one from growing too large.
It will be tested once frontend support is added. At that point I will test it end to end with the existing e2e tests, plus some scale and recovery tests specific to arrangement backfill.
It will also be benchmarked against backfill then.
Comparison to backfill:
StateTable is needed for replication. So it uses StateTable instead of StorageTable. Later frontend will need to take care of instantiating the table, making sure it is of type is_replicated. We should add an assertion for ArrangementBackfill::new when that's done.
StateTable, this also means we have to break out of the backfill stream loop first. This is so we drop the immutable reference to StateTable, which is used to read snapshot.
process_barrier) does not seem ideal as well, since it covers many variables >= 10.
Changes to backfill state persistence:
Changes to snapshot read:
Changes to update processing (cr @hzxa21):
Other refactoring:
backfill.rs -> backfill/no_shuffle_backfill.rs
backfill/utils.rs
Here are further changes to existing interfaces (OUTDATED, IGNORE THIS SECTION):
StorageTable, which already supported our read pattern). Our read pattern is different, so we need additional interfaces.
iter_all_vnodes_with_pk_range.
merge_sort implementation from batch. Ideally I would like to reuse the original merge_sort there, but there are some difficulties as described in the docstring.
chunk, via collect_data_chunk. I managed to refactor it so it could be reused, but can't quite keep this method in a trait, due to orphan instance rules.
TableIter interface, since several locations use its next_row method still. It can be removed next time.
Regarding persisting arrangement backfill state:
backfill used to persist the state.
arrangement_backfill is implemented e2e, I will remove it.
Checklist For Contributors
./risedev check (or alias, ./risedev c)
Checklist For Reviewers